http://lmarti.com; lmarti@ele.puc-rio.br
Programa de Verão 2015 - Laboratório Nacional de Computação Científica
For the slides we need to import the required modules:
import time, array, random, copy, math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
It's "easy": we do it all the time.
$$ \renewcommand{\vec}[1]{\mathbf{#1}} \newcommand{\set}[1]{\mathcal{#1}} \newcommand{\dom}{\preccurlyeq} \begin{array}{rl} \mathrm{minimize} & \vec{F}(\vec{x})=\langle f_1(\vec{x}),\ldots,f_M(\vec{x})\rangle\,,\\ \mathrm{subject}\ \mathrm{to} & c_1(\vec{x}),\ldots,c_C(\vec{x})\le 0\,,\\ & d_1(\vec{x}),\ldots,d_D(\vec{x})= 0\,,\\ & \text{with}\ \vec{x}\in\set{D}\,, \end{array} $$
Note: In case you are still wondering, a maximization problem can be posed as a minimization one: $\min\ -\vec{F}(\vec{x})$.
Usually, there is not a unique solution that minimizes all objective functions simultaneously, but, instead, a set of equally good trade-off solutions.
$$ \set{A}^\ast=\left\{ \vec{x}\in\set{A} \left|\not\exists\vec{y}\in\set{A}:\vec{y}\dom\vec{x}\right.\right\}. $$
from deap import algorithms, base, benchmarks, tools, creator
Setting a constant seed so we always obtain the same results (and avoid surprises in class). You should not do this in a real-world case!
random.seed(a=42)
creator.create("FitnessMin", base.Fitness, weights=(-1.0,-1.0))
creator.create("Individual", array.array, typecode='d',
fitness=creator.FitnessMin)
$$ \begin{array}{rl} \text{minimize} & f_1(\vec{x}),f_2(\vec{x}) \\ \text{such that} & f_1(\vec{x}) = \frac{1}{2}\left( \sqrt{1 + (x_1 + x_2)^2} + \sqrt{1 + (x_1 - x_2)^2} + x_1 - x_2\right) + d,\\ & f_2(\vec{x}) = \frac{1}{2}\left( \sqrt{1 + (x_1 + x_2)^2} + \sqrt{1 + (x_1 - x_2)^2} - x_1 + x_2\right) + d,\\ \text{with}& d = \lambda e^{-\left(x_1-x_2\right)^2}\ (\text{generally }\lambda=0.85) \text{ and } \vec{x}\in \left[-1.5,1.5\right]^2. \end{array} $$
Implementing the Dent problem
def dent(individual, lbda = 0.85):
"""
Implements the test problem Dent
    Num. variables = 2; bounds in [-1.5, 1.5]; num. objectives = 2.
@author Cesar Revelo
"""
d = lbda * math.exp(-(individual[0] - individual[1]) ** 2)
f1 = 0.5 * (math.sqrt(1 + (individual[0] + individual[1]) ** 2) + \
math.sqrt(1 + (individual[0] - individual[1]) ** 2) + \
individual[0] - individual[1]) + d
f2 = 0.5 * (math.sqrt(1 + (individual[0] + individual[1]) ** 2) + \
math.sqrt(1 + (individual[0] - individual[1]) ** 2) - \
individual[0] + individual[1]) + d
return f1, f2
Preparing a DEAP toolbox with Dent.
toolbox = base.Toolbox()
BOUND_LOW, BOUND_UP = -1.5, 1.5
NDIM = 2
# toolbox.register("evaluate", lambda ind: benchmarks.dtlz2(ind, 2))
toolbox.register("evaluate", dent)
Defining attributes, individuals and population.
def uniform(low, up, size=None):
try:
return [random.uniform(a, b) for a, b in zip(low, up)]
except TypeError:
return [random.uniform(a, b) for a, b in zip([low] * size, [up] * size)]
toolbox.register("attr_float", uniform, BOUND_LOW, BOUND_UP, NDIM)
toolbox.register("individual", tools.initIterate, creator.Individual, toolbox.attr_float)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
Creating an example population distributed as a mesh.
num_samples = 50
limits = [np.arange(BOUND_LOW, BOUND_UP, (BOUND_UP - BOUND_LOW)/num_samples)] * NDIM
sample_x = np.meshgrid(*limits)
flat = []
for i in range(len(sample_x)):
x_i = sample_x[i]
flat.append(x_i.reshape(num_samples**NDIM))
example_pop = toolbox.population(n=num_samples**NDIM)
for i, ind in enumerate(example_pop):
for j in range(len(flat)):
ind[j] = flat[j][i]
fitnesses = toolbox.map(toolbox.evaluate, example_pop)
for ind, fit in zip(example_pop, fitnesses):
ind.fitness.values = fit
plt.figure(figsize=(10,4))
plt.subplot(1,2,1)
for ind in example_pop: plt.plot(ind[0], ind[1], 'k.', ms=3)
plt.xlabel('$x_1$');plt.ylabel('$x_2$');plt.title('Decision space');
plt.subplot(1,2,2)
for ind in example_pop: plt.plot(ind.fitness.values[0], ind.fitness.values[1], 'k.', ms=3)
plt.xlabel('$f_1(\mathbf{x})$');plt.ylabel('$f_2(\mathbf{x})$');
plt.xlim((0.5,3.6));plt.ylim((0.5,3.6)); plt.title('Objective space');
We also need an example individual, a_given_individual.
a_given_individual = toolbox.population(n=1)[0]
a_given_individual[0] = 0.5
a_given_individual[1] = 0.5
a_given_individual.fitness.values = toolbox.evaluate(a_given_individual)
Implementing the Pareto dominance relation between two individuals.
def pareto_dominance(ind1, ind2):
    'Returns `True` if `ind1` dominates `ind2`.'
    strictly_better = False
    # compare the objectives pairwise, not all pairs of objectives
    for item1, item2 in zip(ind1.fitness.values, ind2.fitness.values):
        if item1 > item2:
            return False
        if not strictly_better and item1 < item2:
            strictly_better = True
    return strictly_better
Note: Bear in mind that DEAP implements a Pareto dominance relation that is probably more efficient than this implementation. The previous function would then be something like:
def efficient_pareto_dominance(ind1, ind2):
    # isDominated(w1, w2) is True when w1 is dominated by w2 (on maximization wvalues)
    return tools.emo.isDominated(ind2.fitness.wvalues, ind1.fitness.wvalues)
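As a quick sanity check, both implementations should agree, for example, on a_given_individual against a member of the example population:

# a quick illustrative sanity check: both implementations should give the same answer
other = example_pop[0]
print(pareto_dominance(a_given_individual, other),
      efficient_pareto_dominance(a_given_individual, other))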
Let's compute the set of individuals that are dominated by a_given_individual, the ones that dominate it (its dominators) and the remaining ones.
dominated = [ind for ind in example_pop if pareto_dominance(a_given_individual, ind)]
dominators = [ind for ind in example_pop if pareto_dominance(ind, a_given_individual)]
others = [ind for ind in example_pop if ind not in dominated and ind not in dominators]
def plot_dent():
'Plots the points in decision and objective spaces.'
plt.figure(figsize=(10,5))
plt.subplot(1,2,1)
for ind in dominators: plt.plot(ind[0], ind[1], 'r.')
for ind in dominated: plt.plot(ind[0], ind[1], 'g.')
for ind in others: plt.plot(ind[0], ind[1], 'k.', ms=3)
plt.plot(a_given_individual[0], a_given_individual[1], 'bo', ms=6);
plt.xlabel('$x_1$');plt.ylabel('$x_2$');
plt.title('Decision space');
plt.subplot(1,2,2)
for ind in dominators: plt.plot(ind.fitness.values[0], ind.fitness.values[1], 'r.', alpha=0.7)
for ind in dominated: plt.plot(ind.fitness.values[0], ind.fitness.values[1], 'g.', alpha=0.7)
for ind in others: plt.plot(ind.fitness.values[0], ind.fitness.values[1], 'k.', alpha=0.7, ms=3)
plt.plot(a_given_individual.fitness.values[0], a_given_individual.fitness.values[1], 'bo', ms=6);
plt.xlabel('$f_1(\mathbf{x})$');plt.ylabel('$f_2(\mathbf{x})$');
plt.xlim((0.5,3.6));plt.ylim((0.5,3.6));
plt.title('Objective space');
plt.tight_layout()
Having a_given_individual (blue dot) we can now plot those that are dominated by it (in green), those that dominate it (in red) and those that are incomparable to it.
plot_dent()
Obtaining the nondominated front.
non_dom = tools.sortNondominated(example_pop, k=len(example_pop),
first_front_only=True)[0]
plt.figure(figsize=(5,5))
for ind in example_pop:
plt.plot(ind.fitness.values[0], ind.fitness.values[1], 'k.', ms=3, alpha=0.5)
for ind in non_dom:
plt.plot(ind.fitness.values[0], ind.fitness.values[1], 'bo', alpha=0.74, ms=5)
An example: solving the traveling salesman problem (TSP) by brute force:
$n$ cities | time |
---|---|
10 | 3 secs |
12 | 3 secs × 12 × 11 = 6.6 mins |
14 | 6.6 mins × 13 × 14 = 20 hours |
24 | 3 secs × 24! / 10! = 16 billion years |
Note: See my PhD EC course notebooks https://github.com/lmarti/evolutionary-computation-course for a notebook on solving the TSP problem using evolutionary algorithms.
$$ \begin{array}{rl} \mathrm{minimize} & F(\vec{x})= w_1f_1(\vec{x})+\cdots + w_if_i(\vec{x}) +\cdots +w_Mf_M(\vec{x})\,,\\ \mathrm{subject}\ \mathrm{to} & c_1(\vec{x}),\ldots,c_C(\vec{x})\le 0\,,\\ & d_1(\vec{x}),\ldots,d_D(\vec{x})= 0\,,\\ & \text{with}\ \vec{x}\in\set{D}\,, \end{array} $$
All objectives are important... but some objectives are more important than others.
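For instance, a weighted-sum scalarization of the Dent problem we already implemented could be sketched as follows (the equal weights are only an illustrative choice):

def weighted_sum_dent(individual, weights=(0.5, 0.5)):
    'A sketch: collapses the two Dent objectives into one with (illustrative) weights.'
    return sum(w * f for w, f in zip(weights, dent(individual))),  # single-objective fitness tuple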
Ideas:
This looks like the perfect task for an evolutionary algorithm.
Mating selection + Variation (offspring generation) + Environmental selection $\implies$ global + local parallel search features.
def evolutionary_algorithm():
    'A sketch of a generic evolutionary algorithm; the helper functions are placeholders.'
    populations = []  # a list with all the populations
    populations.append(initialize_population(pop_size))
    t = 0
    while not stop_criterion(populations[t]):
        fitnesses = evaluate(populations[t])
        offspring = mating_and_variation(populations[t],
                                         fitnesses)
        populations.append(environmental_selection(populations[t],
                                                   offspring))
        t = t + 1
Each of these categories stores individuals that are only dominated by the elements of the previous categories, $$ \begin{array}{rl} \forall \vec{x}\in\set{F}_i: &\exists \vec{y}\in\set{F}_{i-1} \text{ such that } \vec{y}\dom\vec{x},\text{ and }\\ &\not\exists\vec{z}\in \set{P}_t\setminus\left( \set{F}_1\cup\ldots\cup\set{F}_{i-1} \right)\text{ such that }\vec{z}\dom\vec{x}\,; \end{array} $$ with $\mathcal{F}_1$ equal to $\mathcal{P}_t^\ast$, the set of non-dominated individuals of $\mathcal{P}_t$.
After all individuals are ranked, a local crowding distance is assigned to them. The assignment process goes as follows.
Here the $\mathrm{sort}\left(\set{F},m\right)$ function produces an ordered index vector $\vec{I}$ with respect to objective function $m$.
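To make the description concrete, here is a minimal sketch of that assignment process (DEAP ships its own implementation, tools.emo.assignCrowdingDist, which is what selNSGA2 relies on):

def crowding_distance_assignment(front):
    'A minimal sketch of the crowding distance assignment described above.'
    n = len(front)
    dist = [0.0] * n
    n_objs = len(front[0].fitness.values)
    for m in range(n_objs):
        # sort(F, m): the index vector I ordering the front by the m-th objective
        I = sorted(range(n), key=lambda i: front[i].fitness.values[m])
        f_min = front[I[0]].fitness.values[m]
        f_max = front[I[-1]].fitness.values[m]
        dist[I[0]] = dist[I[-1]] = float('inf')  # boundary solutions are always preserved
        if f_max == f_min:
            continue  # this objective does not discriminate the front
        for k in range(1, n - 1):
            dist[I[k]] += (front[I[k+1]].fitness.values[m] -
                           front[I[k-1]].fitness.values[m]) / (f_max - f_min)
    return dist

For example, crowding_distance_assignment(non_dom) would yield the distances of the nondominated front computed before.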
Sorting the population by rank and distance.
Now we have the key elements of the non-dominated sorting GA.
We will deal with DTLZ3, which is a more difficult test problem.
New toolbox instance with the necessary components.
toolbox = base.Toolbox()
BOUND_LOW, BOUND_UP = 0.0, 1.0
toolbox.register("evaluate", lambda ind: benchmarks.dtlz3(ind, 2))
Describing attributes, individuals and population and defining the selection, mating and mutation operators.
toolbox.register("attr_float", uniform, BOUND_LOW, BOUND_UP, NDIM)
toolbox.register("individual", tools.initIterate, creator.Individual, toolbox.attr_float)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("mate", tools.cxSimulatedBinaryBounded, low=BOUND_LOW, up=BOUND_UP, eta=20.0)
toolbox.register("mutate", tools.mutPolynomialBounded, low=BOUND_LOW, up=BOUND_UP, eta=20.0, indpb=1.0/NDIM)
toolbox.register("select", tools.selNSGA2)
Let's also use the toolbox to store other configuration parameters of the algorithm. This will prove useful when performing massive experiments.
toolbox.pop_size = 50
toolbox.max_gen = 500
toolbox.mut_prob = 0.2
Storing all the required information in the toolbox and using DEAP's algorithms.eaMuPlusLambda function allows us to create a very compact (albeit not a 100% exact copy of the original) implementation of NSGA-II.
def nsga_ii(toolbox, stats=None, verbose=False):
pop = toolbox.population(n=toolbox.pop_size)
pop = toolbox.select(pop, len(pop))
return algorithms.eaMuPlusLambda(pop, toolbox, mu=toolbox.pop_size,
lambda_=toolbox.pop_size,
cxpb=1-toolbox.mut_prob,
mutpb=toolbox.mut_prob,
stats=stats,
ngen=toolbox.max_gen,
verbose=verbose)
We are now ready to run our NSGA-II.
%time res, logbook = nsga_ii(toolbox)
CPU times: user 37.6 s, sys: 296 ms, total: 37.9 s Wall time: 38.9 s
We can now get the Pareto fronts in the results (res).
fronts = tools.emo.sortLogNondominated(res, len(res))
plot_colors = ('b','r', 'g', 'm', 'y', 'k', 'c')
fig, ax = plt.subplots(1, figsize=(4,4))
for i,inds in enumerate(fronts):
par = [toolbox.evaluate(ind) for ind in inds]
df = pd.DataFrame(par)
df.plot(ax=ax, kind='scatter', label='Front ' + str(i+1),
x=df.columns[0], y=df.columns[1],
color=plot_colors[i % len(plot_colors)])
plt.xlabel('$f_1(\mathbf{x})$');plt.ylabel('$f_2(\mathbf{x})$');
We create a stats object to store the individuals themselves, not only their objective function values.
stats = tools.Statistics()
stats.register("pop", copy.deepcopy)
toolbox.max_gen = 4000 # we need more generations!
Re-run the algorithm to get the data necessary for plotting.
%time res, logbook = nsga_ii(toolbox, stats=stats)
CPU times: user 46.5 s, sys: 199 ms, total: 46.7 s Wall time: 46.9 s
from JSAnimation import IPython_display
import matplotlib.colors as colors
from matplotlib import animation
def animate(frame_index, logbook):
'Updates all plots to match frame _i_ of the animation.'
ax.clear()
fronts = tools.emo.sortLogNondominated(logbook.select('pop')[frame_index],
len(logbook.select('pop')[frame_index]))
for i,inds in enumerate(fronts):
par = [toolbox.evaluate(ind) for ind in inds]
df = pd.DataFrame(par)
df.plot(ax=ax, kind='scatter', label='Front ' + str(i+1),
x=df.columns[0], y =df.columns[1], alpha=0.47,
color=plot_colors[i % len(plot_colors)])
ax.set_title('$t=$' + str(frame_index))
ax.set_xlabel('$f_1(\mathbf{x})$');ax.set_ylabel('$f_2(\mathbf{x})$')
return None
fig = plt.figure(figsize=(4,4))
ax = fig.gca()
anim = animation.FuncAnimation(fig, lambda i: animate(i, logbook),
frames=len(logbook), interval=60,
blit=True)
anim
The previous animation makes the notebook too big for online viewing. To circumvent this, it is better to save the animation as video and (manually) upload it to YouTube.
anim.save('nsgaii-dtlz3.mp4', fps=15, bitrate=-1, dpi=500)
from IPython.display import YouTubeVideo
YouTubeVideo('Cm7r4cJq59s')
Here it is clearly visible how the algorithm "jumps" from one local optimum to a better one as evolution takes place.
Each problem instance is meant to test the algorithms with regard to a given feature: local optima, convexity, discontinuity, bias, or a combination of them.
DEAP still lacks some of these problems, so I will grab the DTLZ5, DTLZ6 and DTLZ7 problems from inspyred and adapt them for DEAP.
Note to self: send these problems to the DEAP guys.
DTLZ5 and DTLZ6 have an $m-1$-dimensional Pareto-optimal front.
def dtlz5(ind, n_objs):
from functools import reduce
g = lambda x: sum([(a - 0.5)**2 for a in x])
gval = g(ind[n_objs-1:])
theta = lambda x: math.pi / (4.0 * (1 + gval)) * (1 + 2 * gval * x)
fit = [(1 + gval) * math.cos(math.pi / 2.0 * ind[0]) *
reduce(lambda x,y: x*y, [math.cos(theta(a)) for a in ind[1:]])]
for m in reversed(range(1, n_objs)):
if m == 1:
fit.append((1 + gval) * math.sin(math.pi / 2.0 * ind[0]))
else:
fit.append((1 + gval) * math.cos(math.pi / 2.0 * ind[0]) *
reduce(lambda x,y: x*y, [math.cos(theta(a)) for a in ind[1:m-1]], 1) *
math.sin(theta(ind[m-1])))
return fit
def dtlz6(ind, n_objs):
from functools import reduce
gval = sum([a**0.1 for a in ind[n_objs-1:]])
theta = lambda x: math.pi / (4.0 * (1 + gval)) * (1 + 2 * gval * x)
fit = [(1 + gval) * math.cos(math.pi / 2.0 * ind[0]) *
reduce(lambda x,y: x*y, [math.cos(theta(a)) for a in ind[1:]])]
for m in reversed(range(1, n_objs)):
if m == 1:
fit.append((1 + gval) * math.sin(math.pi / 2.0 * ind[0]))
else:
fit.append((1 + gval) * math.cos(math.pi / 2.0 * ind[0]) *
reduce(lambda x,y: x*y, [math.cos(theta(a)) for a in ind[1:m-1]], 1) *
math.sin(theta(ind[m-1])))
return fit
DTLZ7 has many disconnected Pareto-optimal fronts.
def dtlz7(ind, n_objs):
gval = 1 + 9.0 / len(ind[n_objs-1:]) * sum([a for a in ind[n_objs-1:]])
    fit = [a for a in ind[:n_objs-1]]  # the first M-1 objectives are the decision variables themselves
fit.append((1 + gval) * (n_objs - sum([a / (1.0 + gval) * (1 + math.sin(3 * math.pi * a)) for a in ind[:n_objs-1]])))
return fit
How does our NSGA-II behave when faced with different benchmark problems?
problem_instances = {'ZDT1': benchmarks.zdt1, 'ZDT2': benchmarks.zdt2,
'ZDT3': benchmarks.zdt3, 'ZDT4': benchmarks.zdt4,
'DTLZ1': lambda ind: benchmarks.dtlz1(ind,2),
'DTLZ2': lambda ind: benchmarks.dtlz2(ind,2),
'DTLZ3': lambda ind: benchmarks.dtlz3(ind,2),
'DTLZ4': lambda ind: benchmarks.dtlz4(ind,2, 100),
'DTLZ5': lambda ind: dtlz5(ind,2),
'DTLZ6': lambda ind: dtlz6(ind,2),
'DTLZ7': lambda ind: dtlz7(ind,2)}
toolbox.max_gen = 1000
stats = tools.Statistics(lambda ind: ind.fitness.values)
stats.register("obj_vals", np.copy)
def run_problem(toolbox, problem):
toolbox.register('evaluate', problem)
return nsga_ii(toolbox, stats=stats)
Running NSGA-II to solve all the problems. Now it takes longer.
%time results = {problem: run_problem(toolbox, problem_instances[problem]) \
for problem in problem_instances}
CPU times: user 1min 58s, sys: 348 ms, total: 1min 58s Wall time: 1min 59s
Creating this animation takes more programming effort.
class MultiProblemAnimation:
def init(self, fig, results):
self.results = results
self.axs = [fig.add_subplot(3,4,i+1) for i in range(len(results))]
self.plots =[]
for i, problem in enumerate(sorted(results)):
(res, logbook) = self.results[problem]
pop = pd.DataFrame(data=logbook.select('obj_vals')[0])
plot = self.axs[i].plot(pop[0], pop[1], 'b.', alpha=0.47)[0]
self.plots.append(plot)
fig.tight_layout()
def animate(self, t):
'Updates all plots to match frame _i_ of the animation.'
        for i, problem in enumerate(sorted(self.results)):
#self.axs[i].clear()
(res, logbook) = self.results[problem]
pop = pd.DataFrame(data=logbook.select('obj_vals')[t])
self.plots[i].set_data(pop[0], pop[1])
self.axs[i].set_title(problem + '; $t=' + str(t)+'$')
self.axs[i].set_xlim((0, max(1,pop.max()[0])))
self.axs[i].set_ylim((0, max(1,pop.max()[1])))
return self.axs
mpa = MultiProblemAnimation()
fig = plt.figure(figsize=(14,6))
anim = animation.FuncAnimation(fig, mpa.animate, init_func=mpa.init(fig,results),
frames=toolbox.max_gen, interval=60, blit=True)
anim
Saving the animation as video and uploading it to YouTube.
anim.save('nsgaii-benchmarks.mp4', fps=15, bitrate=-1, dpi=500)
YouTubeVideo('8t-aWcpDH0U')
For a set of solutions $\set{A}$, $$ I_\mathrm{hyp}\left(\set{A}\right) = \mathrm{volume}\left( \bigcup_{\forall \vec{a}\in\set{A}}{\mathrm{hypercube}(\vec{a},\vec{r})}\right)\,. $$
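For the two-objective minimization case this volume is just a sum of rectangle areas, which the following sketch computes for a set of mutually non-dominated points (the points and reference point are made up for illustration):

def hypervolume_2d(points, ref):
    'Hypervolume of a mutually non-dominated 2-D minimization front (a sketch).'
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(points):  # increasing f1 implies decreasing f2
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

print(hypervolume_2d([(1, 3), (2, 2), (3, 1)], ref=(4, 4)))  # 3*1 + 2*1 + 1*1 = 6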
Let's make a relatively simple experiment:
As usual, we need to establish some notation: we will use toolbox instances to define experiments. We start by creating a toolbox that will contain the configuration that will be shared across all experiments.
toolbox = base.Toolbox()
BOUND_LOW, BOUND_UP = 0.0, 1.0
NDIM = 30
# the explanation for this... a few lines below
def eval_helper(ind):
return benchmarks.dtlz3(ind, 2)
toolbox.register("evaluate", eval_helper)
def uniform(low, up, size=None):
try:
return [random.uniform(a, b) for a, b in zip(low, up)]
except TypeError:
return [random.uniform(a, b) for a, b in zip([low] * size, [up] * size)]
toolbox.register("attr_float", uniform, BOUND_LOW, BOUND_UP, NDIM)
toolbox.register("individual", tools.initIterate, creator.Individual, toolbox.attr_float)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("mate", tools.cxSimulatedBinaryBounded, low=BOUND_LOW, up=BOUND_UP, eta=20.0)
toolbox.register("mutate", tools.mutPolynomialBounded, low=BOUND_LOW, up=BOUND_UP, eta=20.0, indpb=1.0/NDIM)
toolbox.register("select", tools.selNSGA2)
toolbox.pop_size = 200
toolbox.max_gen = 500
We add an experiment_name to the toolbox that we will fill up later on.
toolbox.experiment_name = "$P_\mathrm{mut}="
We can now replicate this toolbox instance and then modify the mutation probabilities.
mut_probs = (0.05, 0.15, 0.3)
number_of_experiments = len(mut_probs)
toolboxes=list([copy.copy(toolbox) for _ in range(number_of_experiments)])
Now toolboxes is a list of copies of the same toolbox, one for each experiment configuration (mutation probability). ...but we still have to set the mutation probabilities in the elements of toolboxes.
for i, toolbox in enumerate(toolboxes):
toolbox.mut_prob = mut_probs[i]
toolbox.experiment_name = toolbox.experiment_name + str(mut_probs[i]) +'$'
for toolbox in toolboxes:
print(toolbox.experiment_name, toolbox.mut_prob)
$P_\mathrm{mut}=0.05$ 0.05 $P_\mathrm{mut}=0.15$ 0.15 $P_\mathrm{mut}=0.3$ 0.3
As we are dealing with stochastic methods, their results should be reported relying on a statistical analysis. Each experiment (an algorithm configuration, a toolbox instance in our case) should be repeated a sufficient number of times. Note: I personally like the number 42, as it is the answer to The Ultimate Question of Life, the Universe, and Everything.
number_of_runs = 42
As we are now solving more demanding problems, it would be nice to make our algorithms run in parallel and profit from modern multi-core CPUs. In DEAP this is done by replacing the map() function through the toolbox, for instance with the map() of the multiprocessing or concurrent.futures modules, as sketched below. Note: You can find a very good summary about this issue in http://blog.liang2.tw/2014-handy-dist-computing/.
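For reference, the map() replacement route would look like this sketch; in this notebook, however, we will instead parallelize at the level of whole algorithm runs:

# a sketch only (not used below): make the toolbox evaluate individuals with a process pool
import multiprocessing
pool = multiprocessing.Pool()
toolbox.register("map", pool.map)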
from IPython.html import widgets
from IPython.display import display
progress_bar = widgets.IntProgressWidget(description="Starting...",
max=len(toolboxes)*number_of_runs)
Process-based parallelization based on multiprocessing requires that the parameters passed to map() be pickleable.
This means that lambda functions can not be directly used. Sorry, lambda fans out there! (me included)

def run_algo_wrapper(toolbox):
    result, _ = nsga_ii(toolbox)
    pareto_sets = tools.emo.sortLogNondominated(result, len(result))
    return pareto_sets[0]
This is not a perfect implementation but... it works!
%%time
from multiprocessing import Pool
display(progress_bar)
results = {}
pool = Pool()
for toolbox in toolboxes:
results[toolbox.experiment_name] = pool.map(run_algo_wrapper, [toolbox] * number_of_runs)
progress_bar.value +=number_of_runs
progress_bar.description = "Finished %03d of %03d:" % (progress_bar.value, progress_bar.max)
As you can see, even this relatively small experiment took a lot of time! As running the experiments takes so long, let's save the results so we can use them whenever we want.
import pickle
pickle.dump(results, open('nsga_ii_dtlz3-results.pickle', 'wb'))
In case you need it, this file is included in the github repository.
To load the results we would just have to:
loaded_results = pickle.load(open('nsga_ii_dtlz3-results.pickle', 'rb'))
# results = loaded_results  # <-- uncomment if needed
results is a dictionary, but a pandas DataFrame is a more handy container for the results.
res = pd.DataFrame(results)
res.head()
$P_\mathrm{mut}=0.05$ | $P_\mathrm{mut}=0.15$ | $P_\mathrm{mut}=0.3$ | |
---|---|---|---|
0 | [[1.0, 0.5000269727348605, 0.4999616400061205,... | [[0.9999999999999867, 0.4001150913352416, 0.40... | [[0.9999999999999996, 0.5000103341990221, 0.59... |
1 | [[0.9999999999999996, 0.5000347245740564, 0.50... | [[0.9999999999999994, 0.3000864089066496, 0.50... | [[0.9999999999999021, 0.4004161957652541, 0.49... |
2 | [[0.999999999813255, 0.5999721288215643, 0.499... | [[0.9999999999999981, 0.4995172500410508, 0.49... | [[0.9999999999999997, 0.4001590675284601, 0.50... |
3 | [[0.9999999998399098, 0.4998310912965175, 0.49... | [[0.9999999999415495, 0.5000078089803229, 0.39... | [[0.9999999999999546, 0.5000090989133338, 0.39... |
4 | [[0.9999999999999997, 0.5011132736209546, 0.50... | [[0.9999999999997499, 0.50003632075969, 0.5001... | [[1.0, 0.49942156718161346, 0.4998754336732803... |
a = res.applymap(lambda pop: [toolbox.evaluate(ind) for ind in pop])
plt.figure(figsize=(11,3))
for i, col in enumerate(a.columns):
plt.subplot(1, len(a.columns), i+1)
for pop in a[col]:
x = pd.DataFrame(data=pop)
plt.scatter(x[0], x[1], marker='.', alpha=0.5)
plt.title(col)
The local Pareto-optimal fronts are clearly visible!
Calculating the reference point: a point that is “worse” than any other individual in every objective.
def calculate_reference(results, epsilon=0.1):
alldata = np.concatenate(np.concatenate(results.values))
obj_vals = [toolbox.evaluate(ind) for ind in alldata]
return np.max(obj_vals, axis=0) + epsilon
reference = calculate_reference(res)
reference
array([ 24.2078374 , 23.34556811])
We can now compute the hypervolume of the Pareto-optimal fronts yielded by each algorithm run.
import deap.benchmarks.tools as bt
hypervols = res.applymap(lambda pop: bt.hypervolume(pop, reference))
hypervols.head()
$P_\mathrm{mut}=0.05$ | $P_\mathrm{mut}=0.15$ | $P_\mathrm{mut}=0.3$ | |
---|---|---|---|
0 | 484.044706 | 475.835606 | 467.011099 |
1 | 470.434293 | 448.911864 | 505.030486 |
2 | 497.999042 | 543.511069 | 466.046722 |
3 | 498.620375 | 457.577314 | 511.391365 |
4 | 496.731738 | 537.063814 | 422.195642 |
hypervols.describe()
$P_\mathrm{mut}=0.05$ | $P_\mathrm{mut}=0.15$ | $P_\mathrm{mut}=0.3$ | |
---|---|---|---|
count | 42.000000 | 42.000000 | 42.000000 |
mean | 467.170450 | 449.429266 | 462.117480 |
std | 85.074852 | 67.852181 | 68.171960 |
min | 215.955445 | 266.283832 | 141.399743 |
25% | 451.705130 | 412.557156 | 441.953739 |
50% | 498.309709 | 454.553244 | 467.093601 |
75% | 520.230539 | 500.134986 | 501.287171 |
max | 557.024453 | 544.707224 | 551.328320 |
import seaborn
seaborn.set(style="whitegrid")
fig = plt.figure(figsize=(15,3))
plt.subplot(1,2,1, title='Violin plots of NSGA-II with $P_{\mathrm{mut}}$')
seaborn.violinplot(hypervols, alpha=0.74)
plt.ylabel('Hypervolume'); plt.xlabel('Mutation probabilities')
plt.subplot(1,2,2, title='Box plots of NSGA-II with $P_{\mathrm{mut}}$')
seaborn.boxplot(hypervols, alpha=0.74)
plt.ylabel('Hypervolume'); plt.xlabel('Mutation probabilities');
We start by writing a function that helps us tabulate the results of applying a statistical hypothesis test.
import itertools
import scipy.stats as stats
def compute_stat_matrix(data, stat_func, alpha=0.05):
'''A function that applies `stat_func` to all combinations of columns in `data`.
Returns a squared matrix with the p-values'''
p_values = pd.DataFrame(columns=data.columns, index=data.columns)
for a,b in itertools.combinations(data.columns,2):
s,p = stat_func(data[a], data[b])
p_values[a].ix[b] = p
p_values[b].ix[a] = p
return p_values
The Kruskal-Wallis H-test tests the null hypothesis that the population medians of all of the groups are equal.
stats.kruskal(*[hypervols[col] for col in hypervols.columns])
(4.1170925062938863, 0.12763939044708175)
If the null hypothesis is rejected we can assert that the results are not all the same; but which ones are different from or similar to the others?
In case that the null hypothesis of the Kruskal-Wallis is rejected the Conover–Inman procedure (Conover, 1999, pp. 288-290) can be applied in a pairwise manner in order to determine if the results of one algorithm were significantly better than those of the other.
Note: If you want to get an extended summary of this method check out my PhD thesis.
def conover_inman_procedure(data, alpha=0.05):
    num_runs = len(data)
    num_algos = len(data.columns)
    N = num_runs*num_algos
    # we need the Kruskal-Wallis statistic T, not its p-value
    T, _ = stats.kruskal(*[data[col] for col in data.columns])
    ranked = stats.rankdata(np.concatenate([data[col] for col in data.columns]))
    ranksums = []
    for i in range(num_algos):
        ranksums.append(np.sum(ranked[num_runs*i:num_runs*(i+1)]))
    S_sq = (np.sum(ranked**2) - N*((N+1)**2)/4.0)/(N-1)
    right_side = stats.t.ppf(1-(alpha/2), N-num_algos) * \
                 math.sqrt((S_sq*((N-1-T)/(N-num_algos)))*2.0/num_runs)
    res = pd.DataFrame(columns=data.columns, index=data.columns)
    for i, j in itertools.combinations(np.arange(num_algos), 2):
        # compare the difference of mean ranks against the critical value
        diff = abs(ranksums[i] - ranksums[j])/num_runs
        res[res.columns[i]].ix[j] = diff > right_side
        res[res.columns[j]].ix[i] = diff > right_side
    return res
conover_inman_procedure(hypervols)
$P_\mathrm{mut}=0.05$ | $P_\mathrm{mut}=0.15$ | $P_\mathrm{mut}=0.3$ | |
---|---|---|---|
$P_\mathrm{mut}=0.05$ | NaN | True | True |
$P_\mathrm{mut}=0.15$ | True | NaN | True |
$P_\mathrm{mut}=0.3$ | True | True | NaN |
We now know in what cases the difference is sufficient as to say that one result is better than the other.
Another alternative is the Friedman test.
measurements = [list(hypervols[col]) for col in hypervols.columns]  # one sample per configuration
stats.friedmanchisquare(*measurements)
Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum test (WRS), or Wilcoxon–Mann–Whitney test) is a nonparametric test of the null hypothesis that two populations are the same against an alternative hypothesis, especially that a particular population tends to have larger values than the other.
It has greater efficiency than the $t$-test on non-normal distributions, such as a mixture of normal distributions, and it is nearly as efficient as the $t$-test on normal distributions.
raw_p_values=compute_stat_matrix(hypervols, stats.mannwhitneyu)
raw_p_values
$P_\mathrm{mut}=0.05$ | $P_\mathrm{mut}=0.15$ | $P_\mathrm{mut}=0.3$ | |
---|---|---|---|
$P_\mathrm{mut}=0.05$ | NaN | 0.03572306 | 0.08074789 |
$P_\mathrm{mut}=0.15$ | 0.03572306 | NaN | 0.1549628 |
$P_\mathrm{mut}=0.3$ | 0.08074789 | 0.1549628 | NaN |
The familywise error rate (FWER) is the probability of making one or more false discoveries, or type I errors, among all the hypotheses when performing multiple hypotheses tests.
Example: when performing a single test, there is an $\alpha$ chance of making a type I error. If we make $m$ independent tests, the probability of making at least one type I error is $1-(1-\alpha)^m$, which is bounded from above by $m\alpha$. Therefore, if $\alpha=0.05$ is used and 5 pairwise comparisons are made, we have up to a $5\times0.05 = 0.25$ chance of making a type I error.
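A quick check of those numbers:

# probability of at least one type I error in m = 5 independent tests at alpha = 0.05
alpha, m = 0.05, 5
print(1 - (1 - alpha)**m)  # ~0.226, just below the m*alpha = 0.25 bound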
One of these corrections is the Šidák correction as it is less conservative than the Bonferroni correction: $$\alpha_{SID} = 1-(1-\alpha)^\frac{1}{m},$$ where $m$ is the number of tests.
from scipy.misc import comb
alpha=0.05
alpha_sid = 1 - (1-alpha)**(1/comb(len(hypervols.columns), 2))
alpha_sid
0.016952427508441503
Let's apply the corrected alpha to raw_p_values. A cell with a True value means that the null hypothesis is rejected for that pair, that is, those two results are significantly different.
raw_p_values.applymap(lambda value: value <= alpha_sid)
$P_\mathrm{mut}=0.05$ | $P_\mathrm{mut}=0.15$ | $P_\mathrm{mut}=0.3$ | |
---|---|---|---|
$P_\mathrm{mut}=0.05$ | False | False | False |
$P_\mathrm{mut}=0.15$ | False | False | False |
$P_\mathrm{mut}=0.3$ | False | False | False |
Check out http://simco.gforge.inria.fr/doku.php?id=openproblems
In this class/notebook we have seen some key elements:
Bear in mind that:
What happens if we substitute the crowding distance with another 'secondary' selection criterion?
This IPython notebook can be found at the talk github repository:
This notebook is better viewed rendered as slides. You can convert it to slides and view them by:
$ ipython nbconvert --to slides --post serve <this-notebook-name.ipynb>